Corpus Annotation for Parser Evaluation

Authors

  • John A. Carroll
  • Guido Minnen
  • Ted Briscoe
Abstract

We describe a recently developed corpus annotation scheme for evaluating parsers that avoids shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.
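As a rough illustration of how a corpus of head-dependent relations can be used to score a parser, the sketch below compares a parser's relations against gold-standard relations for one sentence. The `GR` tuple layout, the relation names (`subj`, `det`, `mod`, `obj`), and the exact-match precision/recall scoring are all assumptions made for this example; the abstract does not specify the scheme's relation inventory or its scoring procedure.

```python
from typing import NamedTuple


class GR(NamedTuple):
    """A hypothetical grammatical relation: relation type, head, dependent."""
    relation: str
    head: str
    dependent: str


def score(gold: set, predicted: set) -> tuple:
    """Return (precision, recall, F1) using exact matching of relations."""
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative relations for "the cat sleeps soundly" (invented data).
gold = {
    GR("subj", "sleeps", "cat"),
    GR("det", "cat", "the"),
    GR("mod", "sleeps", "soundly"),
}
# Parser output with one relation mislabelled (assumed for the example).
predicted = {
    GR("subj", "sleeps", "cat"),
    GR("det", "cat", "the"),
    GR("obj", "sleeps", "soundly"),
}

precision, recall, f1 = score(gold, predicted)
```

Scoring sets of relations rather than bracketed constituents means a single mislabelled dependency costs one relation on each side, instead of affecting every bracket that contains it.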

Similar resources

Development of Tree-bank Based Probabilistic Grammar for Urdu Language

The process comprises a hand-tagged corpus, tree annotation on paper for a large corpus, the NU-FAST Treebank in bracketed form, extraction of a CFG from the NU-FAST Treebank, evaluation of a PCFG from the CFG and then a PDCG from the PCFG, for inspection and testing with a PROLOG parser.

A Comparison of Evaluation Metrics for a Broad-Coverage Stochastic Parser

This paper reports on the use of two distinct evaluation metrics for assessing a stochastic parsing model consisting of a broad-coverage Lexical-Functional Grammar (LFG), an efficient constraint-based parser and a stochastic disambiguation model. The first evaluation metric measures matches of predicate-argument relations in LFG f-structures (henceforth the LFG annotation scheme) to a gold stan...

Arborest – a Growing Treebank of Estonian

Treebank creation is a very labor-consuming task, especially if the applications intended include machine learning, gold standard parser evaluation or teaching, since only a manually checked syntactically annotated corpus can provide optimal support for these purposes. There are, however, possibilities to make the annotation process (partly) automatic, saving (manual) annotation time and/or all...

Parser evaluation across text types

When a statistical parser is trained on one treebank, one usually tests it on another portion of the same treebank, partly due to the fact that a comparable annotation format is needed for testing. But the user of a parser may not be interested in parsing sentences from the same newspaper all over, or may even want syntactic annotations for a slightly different text type. Gildea (2001) for instanc...

Acknowledgement 3 Invited Speakers 5

The Survey Parser is a robust parsing system that applies a syntactically rich annotation scheme to natural texts. The parsing scheme is probably one of the most complex, explicitly indicating syntactic categories, their internal structures and also their syntactic functions. Since this parser is designed for the processing of large quantities of text, its evaluation needs to indicate its perfo...

Journal:
  • CoRR

Volume: cs.CL/9907013

Publication date: 1999